- Common Questions and Answers
-
- MSCDEX
-
- MSCDEX (the Microsoft CD-ROM Extensions) software standardizes
- the way in which all CD-ROM drives are accessed by a PC. On its
- release it could hardly fail to succeed, as it was the only
- independent software that allowed a CD-ROM disc to appear to the end
- user as one big floppy disc. The following versions are likely still
- to be in use. If you are not up to date, contact your drive
- supplier to receive a new copy.
-
- Version 1.1: Supports reading High Sierra Group files.
- Version 2.0: Supports reading High Sierra and ISO 9660 files, and
- supports standard audio functions.
- Version 2.1: Supports interleaved audio in CD-ROM XA; compatible
- with MS-DOS 4.0 and MS-NET.
- Version 2.2: Supports MS-DOS version 5.0.
-
- High Sierra (HSG) and ISO 9660 formats
-
- The High Sierra Group (HSG) standard was the first attempt to lay down
- a formatting structure for a CD-ROM (much like the 1.4 Megabyte
- formatting standard for PC floppy disks). Its release allowed
- different developers of CD products to ensure that as wide a range
- as possible of computer users could read their CDs.
-
- Later, a few small changes were made for the sake of efficiency and
- computer compatibility and the ISO 9660 standard was born.
-
- ISO 9660 access software (or drivers, as they are known) is available
- for practically every computer platform, making CD-ROM the most
- platform-independent medium ever to exist.
-
- An ISO 9660 formatted disc can be put into a PC and a DIR command
- performed; in an Apple machine a desktop can be opened; on a Sun
- UNIX system a file list operation can be performed.
-
- Most PC-orientated CD-ROM discs are ISO 9660 these days.
-
-
-
- HFS and other proprietary formats
-
- A few inadequacies in the ISO 9660 format do exist. File names can
- only consist of the characters A to Z, the digits 0 to 9 and
- underscore. A dot is allowed as a name/extension separator. No
- allowance is made for other characters, including spaces and lower
- case letters, or for names longer than eight characters. Even worse,
- the standard does not easily accommodate file resources. This of
- course does not go down too well in the Apple world: for one thing,
- you lose the pretty icons with the ISO 9660 format.
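-
- As a rough illustration of these naming rules, the following Python
- sketch checks whether a name would be acceptable. The function name
- is hypothetical, and two details are assumptions not stated above:
- that the digits 0 to 9 are permitted alongside A to Z and
- underscore, and that the extension is limited to three characters.
-
```python
import re

# Up to 8 name characters, then an optional dot and up to 3 extension
# characters. The digit range and the 3-character extension limit are
# assumptions made for this sketch.
_ISO_NAME = re.compile(r"[A-Z0-9_]{1,8}(\.[A-Z0-9_]{1,3})?")


def is_plain_iso_name(name: str) -> bool:
    """Return True if `name` fits the restricted naming rules described above."""
    return _ISO_NAME.fullmatch(name) is not None


for candidate in ("README.TXT", "readme.txt", "MY FILE.DOC", "LONGFILENAME.TXT"):
    print(candidate, is_plain_iso_name(candidate))
```
-
- Only the first candidate passes; the others fail on lower case
- letters, a space, and a name longer than eight characters.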
-
- As an alternative the Apple systems can cope with the HFS
- (hierarchical file system) format. This is, basically, a direct
- sector for sector copy of an Apple system hard disk and is the most
- common format used in the Apple world. More recently a similar system
- has been used in the Sun/Unix world. VAX VMS systems also use a
- device format system.
-
- Nimbus Information Systems can manufacture CD-ROM discs with both ISO
- 9660 and Apple HFS images on the same disc, keeping both camps happy.
-
- Rock-Ridge
-
- To get around this problem of multiple standards, the Rock-Ridge
- format was devised as a form of 'ISO+'. Whilst very similar to ISO
- 9660, it allows extended name characters (much like the
- Commodore/CDTV version of ISO) and file resources. It also copes
- with mixed mode and form disc formats for the XA, CD-I and Bridge
- standards.
-
- How Big is a CD-ROM?
-
- Newcomers to the world of CD are often confused by the numerous and
- varying figures quoted for the capacity of a CD-ROM. This is mostly
- because CD is a time domain medium: size relates to playing length.
- Nimbus has the world's most accurate mastering lathe and can cut discs
- up to 79 minutes 35 seconds long. Allowing some overhead for
- directory structures, path tables, system areas and a little more
- besides, let's work on 79 minutes. The CD-ROM capacity is therefore:
-
-
- 79 minutes x 60 seconds x 75 blocks x 2,048 bytes
-
- or 728,064,000 bytes
- = 711,000 kilobytes
- = 694 megabytes
-
- Let that be it once and for all!
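-
- The same arithmetic can be checked with a few lines of Python. This
- is only a sketch; the constant and function names are illustrative,
- and the figures are the ones quoted above.
-
```python
SECONDS_PER_MINUTE = 60
BLOCKS_PER_SECOND = 75     # blocks read per second of playing time
BYTES_PER_BLOCK = 2048     # user data bytes per block, as quoted above


def cdrom_capacity_bytes(minutes: int) -> int:
    """Return the data capacity of a disc with `minutes` of playing time."""
    return minutes * SECONDS_PER_MINUTE * BLOCKS_PER_SECOND * BYTES_PER_BLOCK


size = cdrom_capacity_bytes(79)
print(size)                    # 728064000 bytes
print(size // 1024)            # 711000 kilobytes
print(size // (1024 * 1024))   # 694 megabytes (rounded down)
```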
-
-
- Blocking factors and efficiency
-
- Although the capacity of a CD-ROM is 694 megabytes, the usable
- capacity may be less. This is due to the blocking structure of the
- disc. As on your
- hard disk, the smallest part of the disk that can be accessed is a
- sector. A one byte file on your hard disk will take up 512 bytes of
- storage. A one byte file on a CD-ROM will consume 2,048 bytes (the
- sector size or block). Thus if you have 1,000,000 one byte files you
- will not get them onto a CD-ROM even though they only total less than
- one megabyte! For this reason, efficient CD-ROMs have a small number
- of very large files instead of a large number of very small files.
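-
- A minimal Python sketch of this rounding effect follows. It assumes
- every file is rounded up to whole 2,048 byte blocks and ignores
- directory and path table overhead; the names are illustrative.
-
```python
import math

SECTOR_BYTES = 2048                        # CD-ROM block size quoted above
DISC_BYTES = 79 * 60 * 75 * SECTOR_BYTES   # the 79 minute capacity


def space_on_disc(file_sizes, sector_bytes=SECTOR_BYTES):
    """Total bytes consumed when each file is rounded up to whole sectors."""
    return sum(math.ceil(size / sector_bytes) * sector_bytes for size in file_sizes)


tiny_files = [1] * 1_000_000      # one million one-byte files
used = space_on_disc(tiny_files)
print(used)                       # 2048000000
print(used > DISC_BYTES)          # True: the files do not fit on the disc
```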
-
- Audio digitization levels
-
- The main levels of audio digitization that can be used on a CD-ROM
- (other than proprietary file formats and MPC standards) are CD-Digital
- Audio (PCM, pulse code modulation) and the CD-I ADPCM (adaptive delta
- pulse code modulation). These standards allow the following audio
- capacities on a CD-ROM:
-
-
-        fs         res      BW        hours (S/M)        Equiv
-
- CD-DA  44.1 kHz   16 bit   20 kHz    1 hr S             DA
- Lev A  37.8 kHz   8 bit    17 kHz    2 hr S / 4 hr M    LP
- Lev B  37.8 kHz   4 bit    17 kHz    4 hr S / 8 hr M    FM
- Lev C  18.9 kHz   4 bit    8.5 kHz   8 hr S / 16 hr M   AM
-
- All about CD-audio
-
- CD-Audio can play on the vast majority of installed players (only
- some of the older models lacked the audio circuits) and is easily
- controlled by simple software. Users must install their CD-ROM
- drive with MSCDEX version 2.0 or later (see the section on
- MSCDEX). The two main problems with using this format to store
- audio information are that you can only play audio OR load data at
- any one time, and that you are limited to a maximum of one hour of
- stereo or two hours of mono sound. Playing audio whilst displaying
- images can only be done by:
-
- a) buffering images in memory before playing an audio sequence,
-
- b) pre-installing images onto the user's hard disk, so that
- information is recovered from two media at once,
-
- c) head-whizzing - load an image, play audio, load an image,
- play audio etc...
-
- CD-Audio can be accessed in two ways through any standard CD-ROM drive
- with audio capabilities:
-
- a) by track. Note that the CD-ROM track must be track one;
- audio tracks will then be available as tracks 2 to 99.
-
- b) by time code. Note that a maximum resolution of 75 frames
- per second is accessible. In our experience you should aim
- for an accuracy between players of the same model of +/- 2
- frames and between players of different manufacturers +/- 4
- frames.
-
- In both cases you can choose to play left, right channels or stereo
- (or even mute which you can use for synchronisation).
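-
- For time code access, the arithmetic of 75 frames per second can be
- sketched in Python as below. The function names are hypothetical,
- and the sketch only converts between a minutes/seconds/frames time
- code and a frame count; it does not talk to a drive.
-
```python
FRAMES_PER_SECOND = 75   # time code resolution quoted above


def msf_to_frames(minutes: int, seconds: int, frames: int) -> int:
    """Convert a minutes/seconds/frames time code to a total frame count."""
    if not (0 <= seconds < 60 and 0 <= frames < FRAMES_PER_SECOND):
        raise ValueError("seconds must be 0-59 and frames 0-74")
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def frames_to_msf(total: int):
    """Convert a total frame count back to (minutes, seconds, frames)."""
    minutes, rest = divmod(total, 60 * FRAMES_PER_SECOND)
    seconds, frames = divmod(rest, FRAMES_PER_SECOND)
    return minutes, seconds, frames


def tolerance_ms(frames: int) -> float:
    """Express a tolerance given in frames as milliseconds."""
    return frames * 1000 / FRAMES_PER_SECOND


print(msf_to_frames(2, 30, 10))    # 11260
print(frames_to_msf(11260))        # (2, 30, 10)
print(round(tolerance_ms(4), 1))   # 53.3
```
-
- A tolerance of +/- 4 frames therefore means that a cue point may
- land roughly 53 milliseconds early or late on another
- manufacturer's player.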
-
- Level A, B and C ADPCM
-
- Level A ADPCM is no longer recognized as a usable standard and is
- unlikely to be supported in future machines. Whilst the quality
- is quite high (roughly equivalent to that of LP discs) the potential
- gains in additional audio time were thought not to be worth pursuing.
-
- Level B ADPCM is commonly used in CD-I and CD-ROM XA material as a
- reasonable quality compression standard for music, roughly equivalent
- to the sound you would obtain from an FM radio station (but with no
- horrible DJs!).
-
- Level C ADPCM gives you a great quantity of audio but with a
- substantial reduction in quality and is usually only considered for
- speech - where a high bandwidth is not required.
-
- Level B and C decoding are supported by the CD-I and CD-ROM XA
- machines which have yet to penetrate the market to a large degree.
- Both levels of audio require that the disc be mastered in Mode 2, form
- 2 (see discussion of standards and modes) so the mastering house needs
- to know.
-
- The big advantage that ADPCM audio has over CD-DA is that the sound
- can be interleaved with data, enabling your software to load
- images and text whilst playing audio sections. This
- interleaving can also be performed between 'channels' of audio,
- allowing the software to switch between multilingual tracks or
- different backing tracks 'on the fly'. Most players will also play up
- to four channels simultaneously, allowing great flexibility in the
- sound presented to the end user.
-
- Interleaving, how it works and what you lose
-
- Interleaving is at the expense of audio capacity and is sometimes
- quite tricky to work out. It is best considered by using an example:
-
- A sequence requires two different channels of audio in level B
- mono, one a music background track, one a descriptive voice over.
- What is the data rate available for simultaneous images?
-
- Level B mono audio will consume one eighth of an available second
- per second of CD playing time. Thus, two simultaneous channels
- will consume one quarter of a second per second playing time.
- Three quarters of a second per second is available for other
- data. Each second of the CD in Mode 2, Form 2 has 2,336 bytes
- times 75 blocks. So three quarters of 2,336 times 75 is 131,400
- bytes per second. If your average SVGA image takes up 100K that
- means you can pipe off 1.3 images per second whilst
- simultaneously playing one or both of the audio tracks.
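-
- The worked example above can be reproduced with a short Python
- sketch. The names are illustrative, the 2,336 byte block size and
- the one eighth share per Level B mono channel are the figures
- quoted in the example, and '100K' is taken here to mean 100,000
- bytes (an assumption).
-
```python
from fractions import Fraction

BLOCKS_PER_SECOND = 75
BYTES_PER_BLOCK = 2336                 # block size used in the example above
LEVEL_B_MONO_SHARE = Fraction(1, 8)    # disc time used by one Level B mono channel


def data_rate_left(audio_channels, share_per_channel=LEVEL_B_MONO_SHARE):
    """Bytes per second left for other data after interleaving the audio."""
    audio_share = audio_channels * share_per_channel
    if audio_share > 1:
        raise ValueError("audio channels exceed the available disc time")
    return (1 - audio_share) * BLOCKS_PER_SECOND * BYTES_PER_BLOCK


rate = data_rate_left(2)
print(int(rate))                          # 131400 bytes per second
print(round(float(rate) / 100_000, 1))    # 1.3 images per second at 100,000 bytes each
```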
-
- Making Efficient CD-ROMs
-
- System Elements
-
- Optimising a CD-ROM product is not simply a case of analysing
- the CD-ROM architecture. The performance of a CD-ROM is governed
- as much by the system elements of the target as by the medium. A
- developer must understand each element of such a system and
- optimise accordingly. We will address all of these elements
- below: the computer's operating system, the operating system
- extension, the device driver, the CD-ROM drive hardware and, of
- course, the CD-ROM itself.
-
- We will approach this by analysing first the structure of a
- CD-ROM and its effect on performance with regard to the system.
-
- Directory Structures
-
- The logical sequence of events when your programme requests a
- file access has to be well understood before optimisation can
- occur. When a request is sent to open a file the operating system
- will first load the path table on the CD-ROM to find the location
- of the directory record. Next the directory record will be loaded
- to find the file location, then the byte offset is calculated to
- allow the operating system to finally read the relevant section
- of the datafile. It can be seen then that this can involve three
- disc accesses in order to read a file.
-
- All directory records and path tables are stored in 2K logical
- blocks. The optimum directory size is therefore related to this
- block size: a list of about 40 files (the exact number depends on
- the length of the filenames) will fill one block. A list of 42
- files will cover two blocks, rather inefficiently. Loading a list
- of 80 files takes as long (2/75 second minimum) as a list of 42
- files and requires the same amount of sector cache (4K). Within a
- path table you can pack between 100 and 200 directory references
- (again depending on name length).
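-
- To make the block arithmetic concrete, here is a small Python
- sketch. It assumes exactly 40 entries per 2K block, which is only
- the rough figure given above, and the function name is
- hypothetical.
-
```python
import math

FILES_PER_BLOCK = 40          # rough figure from the text; varies with name length
BLOCK_BYTES = 2048
BLOCK_READ_SECONDS = 1 / 75   # minimum time to read one block


def directory_cost(file_count: int):
    """Return (blocks, cache bytes, minimum read seconds) for one directory."""
    blocks = max(1, math.ceil(file_count / FILES_PER_BLOCK))
    return blocks, blocks * BLOCK_BYTES, blocks * BLOCK_READ_SECONDS


for count in (40, 42, 80):
    blocks, cache_bytes, seconds = directory_cost(count)
    print(count, blocks, cache_bytes, round(seconds, 4))
```
-
- Under this assumption, directories of 42 and 80 files both need two
- blocks and 4K of cache, whereas 40 files fit in a single block.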
-
- All is not lost, however. Both the path table and the directory
- records can be buffered in 'sector cache memory' in one
- operation, allowing subsequent operations to request this
- information from memory rather than from disc, provided that you
- have defined enough memory. Allowing enough sector cache to hold
- the path table is very efficient: all future requests for
- directory locations will then be handled within sector cache.
-
- Improvements in access time can be made by grouping in a single
- directory those files that are opened together (or sequentially).
- Your operating system will then reference the disc cache when
- opening a file rather than accessing the disc. It is also
- possible to perform a 'dummy access' to a file when your
- programme is sitting idle so that the next file access is
- prepared by already loading in the relevant information.
-
- Buffering and MSCDEX
-
- Using MSCDEX you can define the disc cache memory using the '/M'
- option. If your directory covers 10 sectors and you only define 5
- sector cache blocks, you will end up reloading the directory
- every time you want to open a file, which can cause vast amounts
- of time to be spent thrashing the CD-ROM disc. Define your
- sector caching to be the same as the largest directory record
- plus two sectors (the last two are used by MSCDEX for its own
- purposes). A simple calculation is shown here that allows you to
- make a recommendation to your CD-ROM users.
-
- /M : (largest number of files in a directory / 40 ) +
- (number of directory files / 100 ) + 4
-
- As a general rule, then, it is better to keep your directory
- sizes small and a multiple of 40 files, and to allow sufficient
- buffering to keep down the number of disc accesses.
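-
- The /M calculation given above can be sketched in Python as
- follows. The function name is hypothetical, and rounding each
- division up to a whole sector is an assumption; the divisors 40 and
- 100 and the constant 4 are taken from the formula.
-
```python
import math


def recommended_m(largest_dir_files: int, directory_count: int,
                  files_per_block: int = 40, dirs_per_block: int = 100,
                  extra: int = 4) -> int:
    """Suggest a value for the MSCDEX /M option using the formula above."""
    directory_blocks = math.ceil(largest_dir_files / files_per_block)
    path_table_blocks = math.ceil(directory_count / dirs_per_block)
    return directory_blocks + path_table_blocks + extra


# Example: largest directory holds 400 files, disc has 250 directories.
buffers = recommended_m(400, 250)
print(f"/M:{buffers}")   # /M:17
```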
-
- The last hint is 'do not rely on the operating system to do the
- file searching for you'. It is far better to group your data into
- a few big files and use your own indexing system to find the
- section of data that you need than using the operating system.
-
- File Structures
-
- It is possible to use logical block sizes from 0.5K to 2K on a
- CD-ROM. Usually this is done when you have many files all less
- than 2K in size. You can, for instance, halve the capacity of a
- CD-ROM disc if all your files are 1K in size and all of your
- logical blocks are 2K in size. Whatever is defined, the CD-ROM
- hardware will address a 2K block minimum, so unless you can VERY
- reliably predict the adjacent files to be opened within the same
- block, this method of data storage proves rather inefficient. It
- is far better to concatenate your small files into one large file
- and hold an index of block offsets.
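-
- One possible way to do this is sketched below in Python. The
- function names and the record names are hypothetical. As a
- variation on the block offsets mentioned above, this sketch stores
- byte offsets so that several small records can share one 2K block;
- the block holding a record is then its offset divided by 2,048.
-
```python
import io

BLOCK_BYTES = 2048


def pack(records):
    """Concatenate records into one blob; index maps name -> (offset, length)."""
    blob = bytearray()
    index = {}
    for name, data in records.items():
        index[name] = (len(blob), len(data))
        blob += data
    return bytes(blob), index


def read_record(stream, index, name):
    """Seek straight to a record using the index, with no directory lookup."""
    offset, length = index[name]
    stream.seek(offset)
    return stream.read(length)


records = {"A.TXT": b"a" * 1024, "B.TXT": b"b" * 1024, "C.TXT": b"c" * 100}
blob, index = pack(records)
stream = io.BytesIO(blob)          # stands in for the one large file on disc

print(len(blob))                           # 2148 bytes instead of three 2K blocks
print(index["B.TXT"][0] // BLOCK_BYTES)    # 0: B.TXT shares block 0 with A.TXT
print(read_record(stream, index, "C.TXT") == records["C.TXT"])   # True
```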
-
- Drive Performance
-
- Never rely on obtaining the theoretical data transfer rate of
- 150K per second. Piping files through the .SYS driver, MSCDEX
- and MS-DOS will slow things down. This can be a major factor when
- reading medium size files like images. The only real way
- around this problem is to bypass all of these layers and
- allow your software to address the .SYS driver directly. This can
- be an arduous task, so most developers opt for file compression to
- cut down the image retrieval time.
-
- We have mentioned already that the sector cache is very
- important when we are considering file access times. A few
- manufacturers have installed a considerable amount of cache
- memory within the CD-ROM drive hardware. This allows a request
- from the computer to read a path table to be served from the
- drive's cache rather than by another access to the disc. This can
- result in a considerable apparent speed increase in your CD-ROM
- product.
-
- BE WARNED: it is not unknown for a developer to create a CD-ROM
- programme that is optimised for a particular drive, only to find
- that it works 100 times slower on other manufacturers' drives.
-
- As we are dealing with relatively slow hardware, keep the number
- of seeks to your data file to a minimum. Use indexing and hashing
- tables which let you approach your data in the most direct manner.
-
- It is obviously inefficient to swing from one end of the disc to
- another and back again. Try to do some access analysis on your
- database and store sequentially accessed data blocks within the
- same vicinity. Sometimes it is even better to duplicate some data
- in order that the seek distances are kept to a minimum at the
- expense of a little more disc estate.
-
- All drive manufacturers have their own seeking algorithms which
- affect how the drive performs (see plots attached). There is
- usually little point in optimising your long seek jumps to medium
- seek jumps; you will gain relatively little improvement in speed.
- If you can optimise to short seek jumps, however, you may well
- get a great improvement in data accessing.
-
- Disc Quality
-
- It is a surprise to most people to find that CD quality can be
- related to the retrieval time. You should find no problems with
- reputable manufacturers; however, a dusty or scratched disc will
- affect the retrieval time. A few of the CD-ROM drives on the
- market perform the final layer of error correction within the PC
- rather than the drive, which can result in long data retrieval
- times when a disc is giving a high error rate. The drive might
- also go back automatically to re-read the same block if error
- rates are high, and in very bad cases no data will be read at all. Treat
- CD-ROMs as carefully as you would any other very high density
- storage medium and you will obtain good performance for the
- lifetime of the data and more.
-
- Apple Discs and HFS
-
- Most of the rules and comments made above can also be applied to
- Apple and HFS CD-ROMs. One important point, however, is the purity
- of the data image. Apple discs can be made by mastering directly
- from a hard disk.
-
- If you have performed a number of file deletes and inserts you
- will have a poorly structured hard disk with sections of files
- dispersed over the hard disk surface. This rather messy structure
- will then be directly transferred to the CD-ROM, which can then
- result in very long access times. Take a fresh, empty hard disk and
- copy your data in the order that you want it to appear on the
- CD-ROM. This will result in a clean, efficient image.
-
- Access Time Analysis Plots for two CD-ROM Drives
-
- Two drive speed plots showing access times versus data location.
- Note that the 'Drive A' plot is very linear and approaches the
- theoretical access time as distance increases. Drive A has a
- powerful head positioning mechanism and a good seek algorithm.
-
- The 'Drive B' plot is more typical of current CD-ROM drives. It
- shows the difference between long access paths and medium
- access paths to be relatively small, so optimise only for short
- access paths to gain a speed advantage.
-
- Drive A
-
- (starting block 0)
-
- End Block   Time (ms)   200       400       600       800       1000      1200
-
-         0         330          *
-     27000         490                  *
-     54000         540                    *
-     81000         570                      *
-    108000         610                        *
-    135000         650                          *
-    162000         690                            *
-    189000         720                             *
-    216000         770                                *
-    243000         800                                 *
-    270000         840                                   *
-    297000         870                                     *
-    324000         910                                       *
-
- Drive B
-
- (starting block 0)
-
- End Block   Time (ms)   200       400       600       800       1000      1200
-
-         0         320         *
-     27000         710                             *
-     54000         820                                  *
-     81000         920                                       *
-    108000         990                                           *
-    135000        1040                                             *
-    162000        1100                                                *
-    189000        1150                                                   *
-    216000        1190                                                     *
-    243000        1240                                                       *
-    270000        1290                                                          *
-    297000        1330                                                            *
-    324000        1360                                                             *
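-
- The figures in the two plots can be used to estimate what a
- shorter seek is worth. The Python sketch below simply subtracts
- table entries; the function name is hypothetical, and the choice
- of 324000, 162000 and 27000 blocks as 'long', 'medium' and 'short'
- seeks is an assumption made for illustration.
-
```python
END_BLOCKS = list(range(0, 324001, 27000))   # end blocks listed in the plots
DRIVE_A_MS = [330, 490, 540, 570, 610, 650, 690, 720, 770, 800, 840, 870, 910]
DRIVE_B_MS = [320, 710, 820, 920, 990, 1040, 1100, 1150, 1190, 1240, 1290, 1330, 1360]


def saving_ms(times, from_block, to_block):
    """Access time saved by seeking to `to_block` instead of `from_block`."""
    return times[END_BLOCKS.index(from_block)] - times[END_BLOCKS.index(to_block)]


for name, times in (("Drive A", DRIVE_A_MS), ("Drive B", DRIVE_B_MS)):
    long_to_medium = saving_ms(times, 324000, 162000)
    short_to_none = saving_ms(times, 27000, 0)
    print(name, long_to_medium, short_to_none)
# Drive A 220 160
# Drive B 260 390
```
-
- On Drive B, halving a full-length seek saves 260 ms out of 1360 ms,
- whereas removing a 27000 block seek saves 390 ms out of 710 ms.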
-
-
-
- Reference: CD-ROM aneCDote from the Nimbus Information System
-